Improving the ensemble speaker and speaking environment modeling approach by enhancing the precision of the online estimation process

Authors

  • Yu Tsao
  • Chin-Hui Lee
Abstract

In this paper, we study methods to enhance the precision of the online estimation process of a recently proposed approach, ensemble speaker and speaking environment modeling (ESSEM), and thereby improve its overall performance. The ESSEM approach consists of two integral phases, offline and online. In the offline phase, an ensemble environment configuration is prepared from a large collection of acoustic models, in which each set of acoustic models represents a particular environment. In the online phase, using speech data from the testing condition, we estimate a mapping function and use it to generate a new set of acoustic models for that particular testing condition. In our previous study, we discussed the issues of the offline process and proposed algorithms to refine the environment configuration. In this paper, we first study different online mapping structures and compare their performance on the same environment configuration. Next, we propose a multiple clustering matching algorithm to further improve the overall performance of ESSEM. We tested ESSEM and its extensions on the full evaluation set of the Aurora2 connected digit recognition task. When using our best offline environment configuration along with a properly specified online estimation method, the ESSEM approach achieves an average word error rate (WER) of 4.77%, corresponding to a relative WER reduction of 13.43% (from 5.51% WER to 4.77% WER) over the baseline result.
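
The 13.43% figure quoted in the abstract is a relative reduction, not an absolute difference in WER. As a minimal sketch of that arithmetic (the function name `relative_wer_reduction` is a hypothetical helper introduced here for illustration and is not taken from the paper), the calculation can be reproduced from the two WER values given above:

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER reduction of new_wer over baseline_wer, in percent.

    Hypothetical helper for illustration; both inputs are WERs in percent.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer * 100.0


# Values quoted in the abstract: baseline 5.51% WER, best ESSEM setup 4.77% WER.
reduction = relative_wer_reduction(5.51, 4.77)
print(f"{reduction:.2f}%")  # prints "13.43%"
```

The absolute improvement is 0.74 percentage points (5.51 minus 4.77); dividing it by the baseline of 5.51 gives the relative reduction reported in the abstract.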

Related resources

An ensemble modeling approach to joint characterization of speaker and speaking environments

We propose an ensemble modeling framework to jointly characterize speaker and speaking environments for robust speech recognition. We represent a particular environment by a super-vector formed by concatenating the entire set of mean vectors of the Gaussian mixture components in its corresponding hidden Markov model set. In the training phase we generate an ensemble speaker and speaking environ...

Machine learning algorithms in air quality modeling

Modern studies in the field of environmental science and engineering show that deterministic models struggle to capture the relationship between the concentration of atmospheric pollutants and their emission sources. Recent advances in statistical modeling based on machine learning approaches have emerged as a solution to these issues. It is a fact that input variable type largely affec...

Microsoft Word - IUCS_Full_final

Recently, we proposed an ensemble speaker and speaking environment modeling (ESSEM) approach to enhance the robustness of automatic speech recognition (ASR) under adverse conditions. The ESSEM framework comprises two phases, offline and online. In the offline phase, we prepare an environment structure that is formed by multiple sets of hidden Markov models (HMMs). Each HMM set represents...

Improving Accuracy in Intrusion Detection Systems Using Classifier Ensemble and Clustering

With recent advances in technology, the number of network-based services is increasing, and sensitive user information is shared through the Internet. Accordingly, large-scale malicious attacks on computer networks could cause severe disruption to network services, so cybersecurity has become a major concern for networks. An intrusion detection system (IDS) could be cons...

A Solution to the Problem of Extrapolation in Car Following Modeling Using an online fuzzy Neural Network

The car-following process is time-varying in essence, due to the involvement of human actions. This paper develops an adaptive technique for car-following modeling in a traffic flow. The proposed technique includes an online fuzzy neural network (OFNN) that can adapt its rule-consequent parameters to time-varying processes. The proposed OFNN is first trained by a growing binary tree le...

Publication date: 2008